Psychoacoustic Segment Scoring for Multi-Form Speech Synthesis

Authors

  • Alexander Sorin
  • Slava Shechtman
  • Vincent Pollet
Abstract

In multi-form segment synthesis, output speech is constructed by splicing waveform segments with parametric speech segments that are statistically modeled and regenerated. The fraction of model-derived segments is called the model-template ratio. The motivation of this work is to further increase the flexibility of multi-form synthesis while maintaining high speech quality at high model-template ratios. An approach is presented in which the representation type of a segment is selected per acoustic leaf. We introduce a novel method for leaf representation selection based on a psychoacoustic segment stationarity score. Additionally, we present refinements in multi-form segment concatenation for high model-template ratio synthesis, including boundary-constrained statistical parametric synthesis and time-domain alignment based on multi-peak analysis of cross-correlation.
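The abstract names time-domain alignment based on multi-peak analysis of cross-correlation but does not describe the procedure. The following is only a minimal sketch of the general idea, not the authors' method. It assumes that several strong local peaks of the normalized cross-correlation are kept as candidate lags and that the candidate closest to a preferred lag is chosen. The function names, the `rel_threshold` parameter, and the selection rule are all hypothetical.

```python
import numpy as np

def candidate_lags(a, b, max_lag, rel_threshold=0.9):
    """Return (lags, scores) at strong local peaks of the normalized
    cross-correlation. A lag k means a[n + k] lines up with b[n].
    Requires max_lag <= min(len(a), len(b)) - 1.
    rel_threshold is an illustrative assumption, not from the paper."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    full = np.correlate(a, b, mode="full")
    zero = len(b) - 1                      # index of lag 0 in `full`
    c = full[zero - max_lag: zero + max_lag + 1]
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    if norm > 0:
        c = c / norm
    lags = np.arange(-max_lag, max_lag + 1)
    # Interior local maxima that are close enough to the global maximum.
    idx = np.arange(1, len(c) - 1)
    is_peak = (c[idx] > c[idx - 1]) & (c[idx] >= c[idx + 1])
    is_strong = c[idx] >= rel_threshold * c.max()
    keep = idx[is_peak & is_strong]
    if keep.size == 0:                     # fall back to the global maximum
        keep = np.array([np.argmax(c)])
    return lags[keep], c[keep]

def align_lag(a, b, max_lag, preferred_lag=0, rel_threshold=0.9):
    """Pick the candidate lag closest to preferred_lag
    (a hypothetical selection rule)."""
    lags, _ = candidate_lags(a, b, max_lag, rel_threshold)
    return int(lags[np.argmin(np.abs(lags - preferred_lag))])

# Usage: a periodic signal yields several near-equal correlation peaks.
n = np.arange(200)
a = np.sin(2 * np.pi * n / 8)
b = np.sin(2 * np.pi * (n + 2) / 8)       # b[n] == a[n + 2]
lags, scores = candidate_lags(a, b, max_lag=12)
best = align_lag(a, b, max_lag=12)
```

For voiced, quasi-periodic speech the cross-correlation has one peak per pitch period, so a single global maximum can be ambiguous. Keeping several candidates lets a secondary criterion decide, which is the design choice this sketch illustrates.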


Similar resources

Uniform Speech Parameterization for Multi-Form Segment Synthesis

In multi-form segment synthesis speech is constructed by sequencing speech segments of different nature: model segments, i.e. mathematical abstractions of speech and template segments, i.e. speech waveform fragments. These multi-form segments can have shared, layered or alternate speech parameterization schemes. This paper introduces an advanced uniform speech parameterization scheme for statis...


Spectral smoothing for concatenative speech synthesis

This paper addresses the topic of performing effective concatenative speech synthesis with a limited database by proposing methods to smooth the transitions between speech segments. The objective is to produce natural-sounding speech via segment concatenation when formants and other spectral features do not align properly. We propose several methods for adjusting the spectra between waveform segm...


Refined inter-segment joining in multi-form speech synthesis

In multi-form speech synthesis, speech output is constructed by splicing waveform segments and parametric speech segments which are generated from statistical models. The decision whether to use the waveform or the statistical parametric form is made per segment. This approach faces certain challenges in the context of inter-segment joining. In this work, we present a novel method whereby all n...


A comparison of spectral smoothing methods for segment concatenation based speech synthesis

There are many scenarios in both speech synthesis and coding in which adjacent time-frames of speech are spectrally discontinuous. This paper addresses the topic of improving concatenative speech synthesis with a limited database by proposing methods to smooth, adjust, or interpolate the spectral transitions between speech segments. The objective is to produce natural-sounding speech via segmen...


Transitional speech segments modeling by matching pursuit with a dictionary based on the psychoacoustic adaptive WP

In this paper, transitional speech segment modeling by matching pursuit is proposed. The dictionary for matching pursuit is composed of wavelet functions that implement a psychoacoustic adaptive wavelet filter bank. Psychoacoustically motivated entropy-based cost functions allow the number of time-frequency atoms in the wavelet packet (WP) dictionary to be greatly reduced. The given transient modeli...



Publication date: 2012